Reviews: Sequential Context Encoding for Duplicate Removal
This paper proposes a new RNN-based duplicate removal method. For each candidate box, informative features are extracted from appearance, position, and ranking information in addition to the score. The candidates are then treated as sequence data and fed into the RNN-based model, which improves the final accuracy by capturing global information. The number of candidate regions far exceeds the number of objects that should remain, so the paper reduces the set of boxes gradually in two stages. Both stages use an RNN model of the same structure. In stage I, the model is trained with NMS results as the teaching signal in order to remove easy boxes. In stage II, the model is trained with the ground-truth boxes in order to remove difficult boxes. Experiments show that the proposed method increases mAP for SOTA object detection methods (FPN, Mask R-CNN, PANet with DCN).
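To make the stage I teaching signal concrete, the following is a minimal sketch of how keep/suppress labels could be derived from greedy NMS. The function names, the `(x1, y1, x2, y2)` box layout, and the IoU threshold of 0.5 are assumptions for illustration; the review does not state the settings the paper actually uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_keep_labels(boxes, scores, iou_threshold=0.5):
    """Return one 0/1 label per box: 1 if greedy NMS keeps it, 0 otherwise.

    The threshold is an assumed value, not one taken from the paper.
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    labels = [0] * len(boxes)
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
            labels[i] = 1
    return labels


# Hypothetical toy input: two overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
stage1_targets = nms_keep_labels(boxes, scores)
print(stage1_targets)
```

In this toy case the second box overlaps the first with an IoU of 81/119 and is labeled as suppressed. A stage I model trained on such labels would learn to mimic NMS on easy cases, leaving the harder decisions to stage II.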
Sequential Context Encoding for Duplicate Removal
Qi, Lu; Liu, Shu; Shi, Jianping; Jia, Jiaya
Duplicate removal is a critical step to accomplish a reasonable amount of predictions in prevalent proposal-based object detection frameworks. In this work, we design a new two-stage framework to effectively select the appropriate proposal candidate for each object. The first stage suppresses most of easy negative object proposals, while the second stage selects true positives in the reduced proposal set. These two stages share the same network structure, an encoder and a decoder formed as recurrent neural networks (RNN) with global attention and context gate. The encoder scans proposal candidates in a sequential manner to capture the global context information, which is then fed to the decoder to extract optimal proposals.
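As a rough illustration of the "global attention" idea mentioned in the abstract, the sketch below computes attention weights over all encoder states for a single decoder state and forms a context vector. Dot-product scoring is an assumption here, since the abstract does not specify the scoring function, and the context gate is omitted. The function names and the toy vectors are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(decoder_state, encoder_states):
    """Dot-product attention over every encoder state (assumed scoring).

    Returns (weights, context), where context is the attention-weighted
    sum of the encoder states.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * enc[k] for w, enc in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


# Hypothetical encoder states, one per proposal in the scanned sequence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = global_attention(decoder_state, encoder_states)
print(weights, context)
```

Because every encoder state contributes to the context vector, the decision for one proposal can depend on all other proposals in the sequence, which is the sense in which the attention is global.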